Two-level grid-based anomaly area identification and solution nomination for radio access networks

ABSTRACT

A system can include a network analysis platform for a two-level grid-based anomaly area identification and solution nomination in a radio access network. The network analysis platform can map key performance indicators for user sessions in the network to a grid that overlays a geographic area. The grid can be based on a military grid reference system. A machine-learning model can take a vector of key performance indicator samples as input and identify a problem for the grid. The network analysis platform can nominate cells to attempt to remediate based on ranking poor performing bins in the grid and determining the cells that contribute most to the problem in each bin. For a nominated cell, the network analysis platform can perform remediation actions to solve a coverage problem, throughput problem, or both. The problem can be solved for the grid while preventing conflicts between individual cells.

BACKGROUND

Radio access networks (“RANs”), such as LTE or 5G networks, require constant monitoring to detect problems with performance and capacity. For example, poor signal strength, poor signal quality, or high traffic demand all can adversely impact user experience.

Normally, problems are identified in a radio access network on a cell basis. For example, an administrator can identify a problem and then attempt to draw a polygon around the problem spots. A troubleshooter can check a serving cell in the problem spot, then adjust parameters of that cell as needed. Parameters can also be adjusted for neighboring cells to attempt to load balance user sessions across the cells. Antenna tilt can also be adjusted for the cells as an attempt to change interference between nearby cells and improve signal strength and quality. Such an approach can also be used to redesign the polygon for the problem area, although the resulting effect is often uncertain.

These existing techniques have several shortcomings. First, the work can be very manual, relying heavily on human trial and error and on the map format for polygon visualization. Second, most existing techniques lack an overall picture of the problem. For example, a coverage problem may result in a recommendation to uptilt an antenna of a cell without acknowledging that the cell is already more overloaded than a neighboring cell. In such a case it may be better to make the neighboring cell the serving cell for the problem location. Similarly, throughput may be increased on a first cell even though a second neighboring cell is underutilized. Third, as the network dynamically changes parameters, conflicts can occur. Parameters can dynamically change back and forth between cells without ever truly resolving the issue, instead shifting the issue back and forth between cells.

SUMMARY

Example implementations of technology described herein include systems and methods for area-based anomaly identification and solution nomination in a radio access network. The area-based technique can use two-level grids, with the first level further divided into smaller grids or “bins” on a second level. Unlike prior methods, the method can allow for area-based problem identification prior to attempting to fix individual cells while resolving some conflicts between cells when adjusting cell parameters. In one example, a network analysis platform can execute on a physical hardware server as part of a RAN architecture. The network analysis platform can identify problems for an entire area and recommend remedial actions.

In one example, the network analysis platform can map key performance indicator (“KPI”) samples from user sessions to individual bins of a grid. The grid can be part of a larger matrix that is applied to a geographic region, dividing the region into bins. For example, each grid can be a ten-by-ten collection of bins, although grid granularity can be set by an administrator in an example. The grid can be produced as part of a military grid reference system, which can include a matrix mapped to the planet's geography. The grid can be a first level of that system, with the bins being a nested second level within the grid. The grid and bins within the matrix can adhere to a numbering system that allows for easy identification of adjacent grids and bins. For example, with a grid reference system, additional digits can be added to reference bins within a grid, and the grids themselves can be referenced within the matrix with the same numbering system. To zoom in and out on the levels (e.g., grid, bin, sub-bin), different numbers of digits can be used on the reference numbers. In this way, each bin can be a grid within the main grid. The KPI samples can be organized with respect to these bins over a time period, such as one hour. This time period can dictate the frequency with which the network analysis platform attempts to identify a problem in the grid.

To identify a problem, the network analysis platform can provide values based on the mapped KPI samples as inputs to a machine learning model. This can include considering the worst KPI samples of the bins (i.e., on the second level), such as the bottom 5%, and taking an average or some other summarization of those samples. The first-level grid itself can be provided as an input, such as by providing vectors of values per bin as inputs to the machine learning model. The machine learning model can be pretrained based on historical data for the region of the grid, and can output an indication that a problem with coverage or throughput exists for the grid.

When a problem exists, the network analysis platform can also determine offender bins within the grid. The offender bins can include KPI samples indicative of poor performance relative to other bins. In an example, the worst bins are identified based on ranking the bins according to the mapped KPI samples. For example, each bin within the grid can be assigned a weight based on ranking, with the bin having the worst performance KPI samples being accorded the highest weight. A certain number of bins with the highest weights can be identified as offender bins. Although “highest” weights is used for convenience, the weights can instead be lowest in another example that distinguishes poor performance with lower weights.

For each identified offender bin, the network analysis platform can determine which serving cells in that bin have the highest percentage of samples and which serving cells have the highest percentage of the samples with the worst KPIs. For example, the worst KPIs can be the lowest 5% for each bin. The network analysis platform can then rank the cells and nominate a certain number of worst-ranked cells within the whole grid for remediation.

The network analysis platform can then apply a remedial action to one or more of the nominated serving cells. This can include looping through all the nominated serving cells and for each cell determining what remedial actions to take. The remedial actions can be selected by a machine learning model or can be based on KPI values in conjunction with the type of problem (e.g., coverage or throughput) detected for the overall grid. Suggested remedial actions can be displayed to an administrator on a console graphical user interface, in an example. Alternatively, the network analysis platform can automate the fixes by sending commands to the nominated cells or neighboring cells. The commands can be sent over a network and implement application programming interfaces for the nominated cells.

Since a set of cells is nominated to fix the problem for the whole grid, the solution can be provided in a coordinated way, which helps resolve some conflicts that otherwise occur when changing parameters of individual cells in isolation. The network analysis platform can consider the remedial actions for all of the nominated cells together, such that a change to one cell does not further exacerbate an issue for another nominated cell. In addition, multiple grids (i.e., on the first level) can be analyzed and remedial solutions for adjacent grids can account for conflicts. For example, if the same target cell exists for both grids, the network analysis platform can ensure that the remedial actions for that target cell are consistent. Likewise, target cells of a first grid can be considered when they are neighboring cells of target cells in a second grid.

The examples summarized above can each be incorporated into a non-transitory, computer-readable medium having instructions that, when executed by a processor associated with a computing device, cause the processor to perform the stages described. Additionally, the example methods summarized above can each be implemented in a system including, for example, a memory storage and a computing device having a processor that executes instructions to carry out the stages described.

Both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the examples, as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flowchart of an example method for grid-based anomaly area identification and solution nomination in a radio access network.

FIG. 2 is a sequence diagram of an example method for grid-based anomaly area identification and solution nomination in a radio access network.

FIG. 3A is a flowchart of an example method for performing actions for nominated cells in a radio access network.

FIG. 3B is a flowchart of an example method for performing actions for nominated cells in a radio access network.

FIG. 4 is an illustration of an example GUI screen for setting parameters used in grid-based anomaly area identification and solution nomination in a radio access network.

FIG. 5 is an illustration of an example system for grid-based anomaly area identification and solution nomination in a radio access network.

DESCRIPTION OF THE EXAMPLES

Reference will now be made in detail to the present examples, including examples illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.

A system for two-level grid-based anomaly detection can identify a problem in a radio access network (“RAN”) based on examining information across a grid. A network analysis platform can apply a grid to a geographic region. For example, the military grid reference system (“MGRS”) can be used for the grid, allowing for customized second-level bin granularity within the first-level grid. The grid can be a first level within the MGRS matrix, with the MGRS bins being a second level nested within the MGRS grid. Additional sub-bins can be nested at still lower levels within each bin, allowing for zooming in and out from a granularity perspective, in an example. Information samples describing user sessions can be collected with respect to the bins in the grid (e.g., second-level MGRS bins for a first-level MGRS grid). Other grid systems besides MGRS are also possible.

User sessions can be serviced by one of multiple serving cells that transmit into the area represented by the grid. The network analysis platform can summarize the samples within the bins of the grid over a time period, such as an hour. For example, an average of the worst five percent of samples in each bin can be calculated. When summaries of the samples are supplied in the grid format (such as in a matrix) to a machine learning (“ML”) model, the model can output whether the grid currently has a coverage problem, throughput problem, both, or neither.

When a problem exists, the network analysis platform can use the grid to determine which cells to adjust in a remediation attempt to fix the problem. For example, the bins can be ranked based on the average of the worst 5% of KPIs. Identified offender bins having the worst performance rankings (i.e., highest or lowest, depending on the example) can then be used to identify cells for potential remediation. For each offender bin, cells that contribute the most to the poor KPIs of that bin can be identified. The identified cells from each offender bin can then be ranked across the grid, with the worst ranked cells being nominated for remediation. After that, the nominated cells can be divided into different remediation cases that can dictate whether the network analysis platform adjusts a cell's tilt, transmission power, or load balancing parameters.

Addressing RAN issues in this way with a two-level grid-based approach can provide more holistic diagnoses than traditional approaches. Instead of focusing on a single cell, the entire area can be taken into account and different cells can be nominated for adjustment using the grid-divided samples from the user sessions. With the advent and rollout of 5G technology, grid-based analysis can take advantage of the huge amount of data expected with internet of things (“IoT”) and 5G devices. Analyzing the entire grid at once can allow for more accurate problem recognition and remediation.

FIG. 1 is a flowchart of an example method for two-level grid-based anomaly area identification and solution nomination in a RAN. At stage 110, the network analysis platform can map KPI samples of user sessions to individual bins of a grid. The bins can be on a second level of the grid. The grid can be applied to a geographic region and can be rectangular in nature. In one example, the network analysis platform includes a console, allowing an administrative user to set the boundaries of the grid in relation to a geographical map. This can include dragging the grid or setting longitude and latitude values for the grid. Additionally, the administrator can adjust the granularity of the bins. For example, the grid can be a 1 kilometer (“km”) square whereas the individual bins can be 100 meter (“m”) squares within the grid. The grid itself can be selected for analysis but can exist within an even larger grid. For example, the grid being analyzed can be a square kilometer, with 100 m bins, but the square kilometer can be nested within a 100 square kilometer grid area. The grid for analysis can be selected by an administrator or based on granularity settings of the network analysis system, in an example. A system such as MGRS can include numbering that naturally allows for such nesting based on the level of zoom that the administrator desires. As explained herein, the bins at the second level can be used to determine problems in the grid at the first level.

In one example, the network analysis platform can utilize MGRS for the grid and bins. MGRS can include a numbering system for latitude and longitude that allows for adjusting bin granularity by adding digits to the numbers. For example, a four-digit MGRS ID can indicate a 1000 m bin, whereas a six-digit MGRS ID can indicate a 100 m square. MGRS is also used as a geocode for planet Earth and can be mapped to a geographic region using the geocode. The use of MGRS or a similar grid system can allow for easily selecting granularity of bins within grids, and granularity of analysis grids within an overall larger grid.
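The relationship between digit count and granularity can be sketched in a few lines of Python. This is only an illustration, assuming the ID is the purely numeric part of an MGRS reference (the zone and 100 km square letters are omitted) with the easting digits first and the northing digits second; the function names `bin_size_m` and `coarsen` are hypothetical and do not come from the platform.

```python
def bin_size_m(numeric_id: str) -> int:
    """Return the side length in meters implied by the digit count.

    Assumption: numeric_id holds only the easting/northing digits of an
    MGRS reference, split evenly (e.g. "954926" -> easting 954, northing 926).
    """
    if not numeric_id.isdigit() or len(numeric_id) % 2 or not 2 <= len(numeric_id) <= 10:
        raise ValueError("expected 2, 4, 6, 8, or 10 digits")
    digits_per_axis = len(numeric_id) // 2
    # One digit per axis is 10 km, two digits is 1 km, three digits is 100 m, ...
    return 10 ** (5 - digits_per_axis)


def coarsen(numeric_id: str, digits_per_axis: int) -> str:
    """Zoom out by truncating each axis to fewer digits."""
    half = len(numeric_id) // 2
    if not 1 <= digits_per_axis <= half:
        raise ValueError("cannot coarsen to more digits than the ID has")
    easting, northing = numeric_id[:half], numeric_id[half:]
    return easting[:digits_per_axis] + northing[:digits_per_axis]


print(bin_size_m("9592"))      # four digits: 1000 m
print(bin_size_m("954926"))    # six digits: 100 m
print(coarsen("954926", 2))    # the enclosing 1 km grid: "9592"
```

Truncating digits in this way is what lets a 100 m bin be associated with its enclosing 1 km grid without any coordinate arithmetic.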

The network analysis platform can collect metrics from user sessions and map them to the bins of the grid being analyzed. The metrics can be KPI samples that indicate a location, a serving cell identifier, and information about the session performance. For example, the KPI samples can indicate reference signal received power (“RSRP”), reference signal received quality (“RSRQ”), signal to interference plus noise ratio (“SINR”), throughput, traffic volume, interference levels, and other performance-related information.

Based on the location information for the KPI sample, each sample can be assigned to a bin of the grid. In one example, this can be done by converting coordinates, e.g., Global Positioning System (“GPS”) coordinates, to an MGRS ID. In another example, the KPI sample can include an MGRS ID.
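As a rough sketch of the assignment step, the following Python groups samples by bin. It assumes each sample already carries easting and northing offsets in meters within a single 100 km square; converting raw GPS latitude and longitude to those offsets requires a map projection that is not shown here. The field names `easting_m` and `northing_m` and both function names are hypothetical.

```python
from collections import defaultdict

# Digits per axis for each supported bin size in meters.
_DIGITS = {10000: 1, 1000: 2, 100: 3, 10: 4, 1: 5}


def bin_id(easting_m: float, northing_m: float, bin_size_m: int = 100) -> str:
    """Build a numeric grid ID from meter offsets inside one 100 km square."""
    if not (0 <= easting_m < 100_000 and 0 <= northing_m < 100_000):
        raise ValueError("offsets must fall inside the 100 km square")
    digits = _DIGITS[bin_size_m]
    e = int(easting_m) // bin_size_m
    n = int(northing_m) // bin_size_m
    # Zero-pad so every ID at a given granularity has the same length.
    return f"{e:0{digits}d}{n:0{digits}d}"


def group_by_bin(samples, bin_size_m: int = 100):
    """Map each KPI sample (a dict) to the bin that contains it."""
    bins = defaultdict(list)
    for sample in samples:
        key = bin_id(sample["easting_m"], sample["northing_m"], bin_size_m)
        bins[key].append(sample)
    return dict(bins)


samples = [
    {"easting_m": 95432, "northing_m": 92671, "rsrp_dbm": -101.0},
    {"easting_m": 95499, "northing_m": 92600, "rsrp_dbm": -98.0},
    {"easting_m": 95500, "northing_m": 92600, "rsrp_dbm": -90.0},
]
grouped = group_by_bin(samples)
print(sorted(grouped))  # ['954926', '955926']
```

If the KPI sample already includes an MGRS ID, the `bin_id` step can be skipped and the sample's ID used directly as the grouping key.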

For analysis purposes, the KPI samples can be organized relative to the bins during a time interval. For example, the collection frequency can be per hour. However, the time interval is also configurable using the console of the network analysis platform, in an example. For example, the collection frequency can be set to 15 minutes or some other time interval.

At stage 120, to determine if a problem exists in the grid, the network analysis platform can provide values based on the mapped KPI samples as inputs to an ML model. In one example, a summary of KPI samples for each bin can be used, with each bin organized as a vector for analysis. The whole grid can be a matrix with each bin as one value (or vector of values) in the matrix. This two-level grid can be used as an input to the ML model. For example, a matrix of values representing the grid can be an input to the ML model.

Each bin can include a vector with a summary of the KPI samples. For example, the vector for each bin can be in the form of [RSRP, throughput, day, hour]. The RSRP for a bin can be the average of the worst 5% RSRP samples in that day and hour. Similarly, the throughput for a bin can be the average of the worst 5% throughput samples in that day and hour. RSRP can be used to determine coverage problems and throughput can be used to determine throughput problems. Other KPIs not included in the vector, such as SINR, can still be utilized later during problem cell nomination.
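A minimal sketch of the per-bin summary follows. It assumes that lower values are worse for both RSRP and throughput, and that at least one sample is always kept when a bin has fewer than twenty samples; neither rounding rule is specified in the description, and the names `worst_percent_mean` and `bin_vector` are hypothetical.

```python
def worst_percent_mean(values, percent: int = 5) -> float:
    """Average the lowest `percent` of the samples (lower is assumed worse)."""
    if not values:
        raise ValueError("bin has no samples")
    # Ceiling division in integers; always keep at least one sample.
    k = max(1, (len(values) * percent + 99) // 100)
    worst = sorted(values)[:k]
    return sum(worst) / k


def bin_vector(rsrp_dbm, throughput_mbps, day: int, hour: int):
    """Build the [RSRP, throughput, day, hour] vector for one bin."""
    return [
        worst_percent_mean(rsrp_dbm),
        worst_percent_mean(throughput_mbps),
        day,
        hour,
    ]


rsrp = list(range(-120, -80))           # 40 samples, so the worst 5% is 2 samples
tput = [float(v) for v in range(1, 41)]
print(bin_vector(rsrp, tput, day=3, hour=14))  # [-119.5, 1.5, 3, 14]
```

Stacking one such vector per bin, in the row and column order of the grid, yields the matrix of vectors that can be supplied to the ML model.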

The ML model can use the matrix of vectors or other values to output an indication of whether a problem with coverage or throughput exists for the grid. The output can represent a problem with coverage, throughput, both, or neither. The output can apply to the entire grid, such that the ML model recognizes whether the corresponding area has the problem. Unlike a prior system where an administrator must start with a serving cell that they believe is problematic, the ML model can identify an entire area as having a problem prior to attempting to remediate any particular cell that serves the area. When analyzing a market, country, or state, there can be many grids. Each grid can be supplied to the ML model so that the analysis proceeds, for example, one 1 km grid at a time.

To determine whether the grid has a problem, the ML model can consider which bins have a problem and how severe the problem is for those bins. Additionally, the ML model can analyze whether the problems last for multiple time periods (e.g., multiple hours) and whether the bins are connected or scattered. With an MGRS grid, the labelling convention can identify neighboring bins. For example, for 100 m bins, ID numbers 954926 and 954927 are neighboring bins. For 1 km bins, 9692 is contiguous with and on the east side of 9692. The scattered or grouped nature of the problematic bins can be used to determine if the grid has a problem.
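To illustrate how adjacency can be read directly from the IDs, the sketch below tests whether two bins touch and measures the largest connected group among a set of problem bins. It is not the ML model itself, which would learn such spatial patterns from training data. It assumes numeric IDs of equal length within the same 100 km square and treats diagonal contact as adjacent; both are assumptions, and the function names are hypothetical.

```python
def split_id(bin_id: str):
    """Split a numeric ID into (easting, northing) integers."""
    half = len(bin_id) // 2
    return int(bin_id[:half]), int(bin_id[half:])


def are_neighbors(a: str, b: str) -> bool:
    """True when two distinct bins differ by at most one step on each axis."""
    if len(a) != len(b):
        raise ValueError("IDs must use the same granularity")
    ea, na = split_id(a)
    eb, nb = split_id(b)
    return a != b and abs(ea - eb) <= 1 and abs(na - nb) <= 1


def largest_cluster(problem_bins) -> int:
    """Size of the largest connected group of problem bins."""
    remaining = set(problem_bins)
    best = 0
    while remaining:
        stack = [remaining.pop()]
        size = 0
        while stack:
            current = stack.pop()
            size += 1
            adjacent = {b for b in remaining if are_neighbors(current, b)}
            remaining -= adjacent
            stack.extend(adjacent)
        best = max(best, size)
    return best


print(are_neighbors("954926", "954927"))                           # True
print(largest_cluster(["954926", "954927", "954928", "950900"]))   # 3
```

A large cluster relative to the number of problem bins suggests a grouped problem area, while a cluster size of one for every bin suggests scattered, isolated bins.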

The ML model can be trained based on historical user session performance metrics. The historical performance metrics can include 720 hours of data for all user sessions in the grid, in an example. Any ML training algorithm can be used, such as a recurrent neural network (“RNN”) or convolutional neural network (“CNN”) algorithm. The recommended ML model can thoroughly study the behavior of the two-level grid to determine problems, unlike a traditional approach of using a simple threshold to determine whether anomalies exist.

Stages 130, 140, and 150 relate to nominating which cells to address as part of fixing the problem identified by the ML model. The remedial actions can be achieved based on nominating an individual cell and, for that cell, changing the antenna tilt, transmission power, and load balancing parameters, such as inter- and intra-frequency handover (“HO”) related parameter thresholds.

The network analysis platform can make these changes on a periodic basis, such as on an hourly basis, based on the time period selected in the console. The changes themselves are intended to be macro-level changes that address the entire grid, such as at the entire 1 km MGRS level. Additional micro changes to the individual nominated cells are intended to help effectuate the macro-level changes. This can reduce shortcomings of existing technological approaches, such as by resolving some conflicts that micro changes often have. For example, changing a first cell's parameters can adversely impact a neighboring cell, leading to changes to the neighboring cell that negatively impact the first cell, and so on. Micro changes can be applied more often than the analyzed time period for a grid, in an example. Micro changes performed by each individual cell are not impacted, in some examples. Instead, the network analysis platform can set an overall remediation strategy for each grid, such as how the load of cells serving that grid can be balanced. The network analysis platform need not run on a real-time basis, in an example.

At stage 130, when the problem is indicated, the network analysis platform can determine offender bins within the grid based on ranking the bins of the matrix according to KPI samples related to coverage or throughput. A worst ranked set of bins can be the offender bins. Each bin can be a vector of KPIs, and the KPIs for each bin can be the average of the worst 5% KPI samples. However, other summaries besides the average of the worst 5% can also be used. The bins can be ranked based on the KPIs, with the worst ranked being identified for further analysis. In one example, the worst ranked bins are given the highest ranking and are identified as the offender bins.

In one example, the bins can be ranked based on each bin's average of the worst five percent of KPIs during the time period, referred to here as an R value. The bins can be ranked based on R with the lowest R as highest ranking. In that example, the bins with the lowest R can be considered as potential offender bins. To determine the offender bins, a maximum number of offender bins, such as five, can be applied. Therefore, the worst ranked bins, up to that maximum, can be selected as the offender bins. Since there can be multiple analyzed KPIs, i.e., RSRP and throughput, the ranking can be done for each of the KPIs. This can result in one ranking of RSRP for a coverage problem and another ranking of throughput for a throughput problem. Other factors can cause the network analysis platform to reduce the identified offender bins. For example, if a bin's R value for RSRP is greater than −95 dBm, this can indicate good performance. As a result, if the bin's R value is in that range, then it can be excluded from the list of offender bins, in an example. Additionally, if fewer than five bins fail to meet the R criteria, then a smaller number of offender bins can be selected.
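The selection just described can be sketched as follows for the RSRP ranking. The default maximum of five and the −95 dBm cutoff come from the example values in the text; the function name `select_offender_bins` and the dictionary input format are hypothetical.

```python
def select_offender_bins(r_by_bin, max_offenders: int = 5,
                         good_above_dbm: float = -95.0):
    """Pick the worst bins by R value (average of worst 5% RSRP per bin).

    r_by_bin maps a bin ID to its R value in dBm. Bins whose R value is
    greater than the cutoff are treated as performing well and skipped.
    """
    candidates = {b: r for b, r in r_by_bin.items() if r <= good_above_dbm}
    # Lowest R is the worst performance, so it ranks first.
    ranked = sorted(candidates, key=lambda b: candidates[b])
    return ranked[:max_offenders]


r_values = {
    "954926": -110.0, "954927": -90.0, "954928": -101.0, "955926": -97.0,
    "955927": -120.0, "955928": -99.0, "956926": -96.0,
}
print(select_offender_bins(r_values))
# ['955927', '954926', '954928', '955928', '955926']
```

When fewer bins than the maximum fall at or below the cutoff, the returned list is simply shorter. A throughput ranking could reuse the same function with a throughput-specific cutoff, which the text does not specify.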

At stage 140, the network analysis platform can nominate serving cells to attempt to fix. For example, for each offender bin determined in stage 130, the network analysis platform can determine cells that contribute to the bin's offender status, such as by ranking the cells of the respective offender bin. In one example, the cells can be ranked based on how many KPI samples are attributed to user sessions served by the respective cell. In one example, the ranking is based on weights. Weights can be generated based on a cell's percentage of total samples in the bin and a percentage of worst samples from each serving cell.

In one example, a total cell weight W can be determined based on the bin ranking, the percentage of samples for the cell, and the percentage of the worst 5% samples for the cell in the current bin. The three factors can be combined together to give the weight of each cell. In one example, fewer than all three factors can be used to determine cell weights.

The serving cells can then be nominated based on rank across the entire grid. If the same cell is nominated for multiple different bins, then the highest weight W for that cell can be used as the cell's final weight. From the cell lists L across all the offender bins, cell weights W can be ranked in descending order. A threshold number, such as top five, of worst cells can be chosen as the nominated serving cells. In other words, the nominated serving cells can be the final cells selected to solve the problem for the grid.
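The nomination logic can be illustrated with the sketch below. The text does not say how the three factors are combined, so multiplying them is purely an assumption made for illustration; the input layout and the name `nominate_cells` are likewise hypothetical.

```python
def nominate_cells(offender_bins, top_n: int = 5):
    """Rank cells across all offender bins and return the top_n cell IDs.

    offender_bins is a list of dicts, each with:
      "bin_weight": weight from the bin ranking (higher means worse bin)
      "cells": {cell_id: (share_of_all_samples, share_of_worst_samples)}
    """
    final_weight = {}
    for offender in offender_bins:
        for cell_id, (share_all, share_worst) in offender["cells"].items():
            # Assumption: combine the three factors as a simple product.
            w = offender["bin_weight"] * share_all * share_worst
            # A cell seen in several bins keeps its highest weight.
            final_weight[cell_id] = max(final_weight.get(cell_id, 0.0), w)
    ranked = sorted(final_weight, key=lambda c: final_weight[c], reverse=True)
    return ranked[:top_n]


offenders = [
    {"bin_weight": 5, "cells": {"A": (0.6, 0.8), "B": (0.4, 0.2)}},
    {"bin_weight": 4, "cells": {"A": (0.3, 0.1), "C": (0.7, 0.9)}},
]
print(nominate_cells(offenders))  # ['C', 'A', 'B']
```

Here cell A appears in both offender bins and keeps the larger of its two weights, matching the rule that the highest weight W becomes the cell's final weight.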

The nominated serving cells can then be targeted for remedial actions. Adjusting these cells can potentially fix the problem(s) identified in stage 120 for the entire grid.

Stages 130 and 140 can also operate by using different KPI values than RSRP in other examples. For example, while RSRP is relevant for coverage, the stages can use downlink throughput KPI samples for the user sessions when attempting to fix a throughput problem. When the ML model of stage 120 indicates that both a coverage problem and a throughput problem exist, both RSRP and downlink throughput KPIs can be considered, separately or together, and other KPI types are also possible for analysis.

In one example, two different nominated cell lists can be created for separate throughput and coverage problems. The nomination process can be largely the same, with different types of KPI samples being considered. These nominated cells can be the target of remedial actions.

At stage 150, the network analysis platform can apply a remedial action to a nominated serving cell. This can include notifying an administrative user, such as through email or a visualized alert on the GUI. Alternatively, the network analysis platform can automatically attempt to apply the remedial action, such as by making application programming interface (“API”) calls to interfaces for the nominated serving cells.

The remedial action can include changing one or more of antenna tilt, transmission power, and load balancing parameters such as inter- and intra-frequency HO related parameter thresholds. These changes can cause improvements to downlink interference, load balance of individual cells, and coverage characteristics of the cell.

The remedial action can be based on a root cause analysis. The root cause analysis of the network analysis platform can consider whether cells are suffering from reduced throughput or coverage (i.e., lost signal), then analyze whether these problems manifest in the form of uplink interference, downlink interference, load imbalance, coverage, or device issues. The recommended remedial action can then be one or more of changing tilt, uplink power, downlink power, load balancing parameters, or software fixes.

In one example, for each nominated cell, the network analysis platform can choose between four different remedial cases to apply. The cases are discussed in more detail below in connection with FIGS. 3A and 3B.

FIG. 2 is a sequence diagram of an example method for a two-level grid-based anomaly area identification and solution nomination in a radio access network. At stage 210, an administrator can use a console GUI of the network analysis platform to set parameters that guide operation of the problem detection. The GUI can be generated by a webserver and set variables used by the network analysis platform. The updated parameters can be sent to the network analysis platform for local storage.

For example, the GUI can include a first option to adjust the time period for the KPI samples. The time period can dictate the frequency with which the grid is used with the ML model to identify a problem. The time period can be, for example, hourly, fifteen minutes, or daily.

As another example, the GUI can include a second option to adjust a size granularity of bins in the grid. When the grid is an MGRS matrix, the GUI can allow the user to specify the boundaries of the matrix and the size of the bins. For example, the bins can adjust in size by changing the number of digits used in the MGRS ID for each bin. The granularity can be manipulated to fit the training capabilities for ML models, with more granular bins being more specific in problem diagnosis but potentially requiring more samples to train. Additionally, the grid itself can be selected and sized on the GUI in an example. The grid selected for analysis can be part of a larger overall matrix. For example, a one square km grid can be selected within the matrix. Alternatively, the matrix can be automatically divided into grids of a selected size, such that multiple grids are independently analyzed as part of the network analysis.

In yet another example, the console GUI can include a third option for configuring which KPI samples to use in the analysis. This can include selecting which types of KPIs to map to the grid and which bottom percentiles to analyze with the ML model or when ranking the bins and cells. Several different selections can be presented for selecting these different criteria.

In another example, the parameters set at stage 210 can be updated automatically. For example, if a problem persists after remedial fixes are applied, different KPI types can be analyzed or a different grid granularity can be selected.

At stage 215, KPI samples can be received at the network analysis platform from various cells within the mobile network. Stage 215 can be ongoing in an example, with KPI samples being received at periodic intervals or constantly queued from reporting cells. The telemetry data can be captured and measured in real time by base stations, which send the telemetry data to the network analysis platform. The network analysis platform itself can perform analysis in non-real-time based on the collected KPI samples, which can be arranged according to time interval.

At stage 220, the network analysis platform can map the KPI samples to the MGRS grid. This can include assigning the samples to the bins of the grid. This can be done based on location information for each sample or for a user whose session is associated with the samples. In one example, the KPI samples themselves are created to include coordinates. This can even include creating KPI samples that include MGRS coordinates to minimize the processing required by the network analysis platform.

Some subset of the KPI samples can be used as inputs to the ML model at stage 225. For example, a bottom five percent of samples (or other selected threshold) of particular KPI types can be averaged and converted into a vector for the bin. For example, sample averages for a particular hour can be assigned to each bin and the grid of vectors can be provided as an input to the ML model at stage 225. The ML model can be pretrained to detect problems within the grid based on the KPI types and grid size.

At stage 230, the ML model can identify a problem with one or both of coverage and throughput for the grid. The problem can be indicated as an alert on the GUI console for the network analysis platform, in an example. This can begin a process of root cause analysis for purposes of performing a remedial action, in an example.

At stage 235, root cause analysis can include nominating which cells to fix. This can include ranking offender bins and ranking cell contributions to the bins' relatively poor rankings (as offenders). For example, the techniques described with respect to stages 130 and 140 of FIG. 1 can be applied here.

The nominated cells and suggested remedial actions can then be visualized on the console GUI at stage 250, in an example. This can allow an administrator to oversee and approve the remedial actions, in an example. Alternatively, the remedial actions can be automatically implemented without human approval. Different nominated cells can have different suggested remedial actions. Additionally, the remedial actions can include changing settings on neighboring cells that may be causing interference to a serving cell.

At stage 240, the remedial actions can be applied. The remedial action can include changing one or more of antenna tilt, transmission power, and load balancing parameters such as inter- and intra-frequency HO related parameter thresholds. These changes can cause improvements to downlink interference, load balance of individual cells, and coverage characteristics of the cell at stage 245.

The remedial action chosen for a particular cell can be based on whether that cell is suffering from reduced throughput or coverage (i.e., lost signal). From there, the network analysis platform can determine whether the problems manifest in the form of uplink interference, downlink interference, load imbalance, coverage, or other device issues. The recommended remedial action can then be one or more of changing tilt, uplink power, downlink power, load balancing parameters, or software fixes.

In one example, for each nominated cell, the network analysis platform can choose between four different remedial cases to apply. The cases are discussed in more detail below in connection with FIGS. 3A and 3B.

FIG. 3A is an example flow chart with stages for performing a remedial action when the network analysis platform detects a problem with throughput or coverage within the grid. The network analysis platform can first determine whether the problem is reduced throughput at stage 310, a coverage issue at stage 312, neither, or both. The determination can be made based on the ML model output, in an example.

The dashed lines lead from the problems at stages 310 and 312 to potential root causes 314, 316, 318, 320, 322. Based on the KPI samples attributed to the nominated cells, one or more of these root causes 314, 316, 318, 320, 322 can be identified. In one example, the network analysis platform can rely on an additional neural network ML model to identify the root cause. The root cause can also be identified by the ML model that identifies the problem for the grid. For example, the ML model can output one or more root causes along with the problem identified based on using the grid vectors as an input.

From there, four different remediation cases 324, 326, 328, 330 are possible. The dashed lines from the root causes 314, 316, 318, 320, 322 to the remediation approaches 324, 326, 328, 330 show which root causes 314, 316, 318, 320, 322 can be linked to which remediation approaches 324, 326, 328, 330.

The first remediation approach 324 includes changing the cell's antenna tilt or transmission power to improve coverage. A second remediation approach 328 can include changing the cell's antenna tilt or transmission power to improve SINR. In one example, load balancing includes changing load balancing parameters at the cell and potentially a neighboring cell to change how user sessions are load balanced between cells. A third approach 326 is to change load balancing parameters, such as parameters that dictate when to pass a user session to another cell. The fourth remediation approach 330 can include recommending a software fix or additional sites.

In one example, the tilt can be restricted to at least two degrees to prevent overshooting. If uptilt to less than two degrees does not fix the problem by itself, the network analysis platform can attempt to also adjust transmission power. Because any change will impact a neighboring cell in some way, the neighboring cell can also be considered.

The remediation approaches 324, 326, 328, 330 are discussed in more detail with respect to FIG. 3B. FIG. 3B is an example flow chart with stages for performing a remedial action when the network analysis platform detects a problem with throughput or coverage within the grid.

At stage 350, a problem is detected by the ML model. The problem can be one or both of throughput and coverage. The root cause can also be determined using a neural network ML model in an example. In general, different KPI shortfalls can indicate the different root causes 314, 316, 318, 320, 322 of FIG. 3A.

The first remedial case 351 can apply to a coverage problem. To fix the coverage, at stage 361 the network analysis platform can either change the tilt or increase transmission power. In one example, changing the tilt is preferred. A physical downlink control channel (“PDCCH”) utilization threshold PT can be defined, such as at 70%. If the PDCCH utilization of the nominated cell is greater than the threshold, the network analysis platform can prevent adding additional traffic to that nominated cell. However, if PDCCH utilization is less than PT and the cell tilt is greater than two degrees, the cell can be uptilted by one degree. (Tilt can be counted from a horizontal level, where zero tilt is flat.) If the transmission power is less than maximum, the transmission power can be incremented. If transmission power is already maximized or none of the above conditions apply, the console GUI can display an alert recommending new sites.
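As a non-limiting illustration, the decision order of the first remedial case 351 can be sketched in Python as shown below. The `Cell` fields, the function name `coverage_action`, and the 1 dB power step are hypothetical and are not specified above. The sketch also assumes one reading of the text: a transmission power increase is gated by the same PDCCH utilization check as the uptilt, because either change can add traffic to the nominated cell.

```python
from dataclasses import dataclass

PDCCH_THRESHOLD = 0.70  # PT, using the 70% example from the text
MIN_TILT_DEG = 2.0      # tilt floor used to prevent overshooting


@dataclass
class Cell:
    # Hypothetical record for a nominated cell; field names are assumptions.
    cell_id: str
    pdcch_util: float        # PDCCH utilization as a fraction, 0.0 to 1.0
    tilt_deg: float          # tilt counted from horizontal (0 is flat)
    tx_power_dbm: float
    max_tx_power_dbm: float


def coverage_action(cell: Cell, power_step_db: float = 1.0) -> tuple[str, float]:
    """Return (action, amount) for a coverage problem on a nominated cell."""
    if cell.pdcch_util < PDCCH_THRESHOLD:
        # Tilt change is preferred: uptilt by one degree while above the floor.
        if cell.tilt_deg > MIN_TILT_DEG:
            return ("uptilt", 1.0)
        # Otherwise fall back to incrementing transmission power.
        if cell.tx_power_dbm < cell.max_tx_power_dbm:
            return ("increase_power", power_step_db)
    # Cell is too loaded, or tilt and power are both exhausted.
    return ("suggest_new_site", 0.0)
```

In this sketch, a cell whose PDCCH utilization exceeds PT is never expanded, which mirrors the statement that the platform can prevent adding traffic to that cell.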

In addition, the network analysis platform can get the nominated cell ID along with the top few, such as three, intra-frequency neighbor cell IDs. The intra-frequency neighbor cells can be the strongest intra-frequency neighbors among all possible neighbors. In one example, the strongest three neighbor cells are identified. Then, the network analysis platform can adjust the tilt (such as with the etilt parameter) of one or more of these neighbor cells. For example, when the uptilt or transmission power is increased on a serving cell, the network analysis platform can apply down tilt on a neighboring cell.

The second remedial case 352 can apply to a throughput problem, such as when SINR values are low. At stage 362, the network analysis platform can down tilt one of the strongest ranked neighbor cells and uptilt the nominated serving cell. If the neighbor cell has already reached a maximum down tilt (e.g., etilt minus two), then the network analysis platform can refrain from further down tilting of that cell and try to down tilt another of the ranked neighbor cells.

To determine which of the ranked neighbor cells to down tilt, the network analysis platform can start with the highest ranked and see if the neighbor cell is also a different nominated serving cell. If so, the network analysis platform can skip that neighbor cell and traverse to the next ranked neighbor cell, performing the same check before down tilting. In this way, the network analysis platform can prevent a conflict, such as weakening a neighbor cell that itself is already nominated for fixing.
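The neighbor traversal described for the second remedial case 352 can be sketched, for illustration only, as a single pass over the ranked neighbor list. The function name `pick_neighbor_to_downtilt` and its arguments are hypothetical; the inputs are assumed to be plain cell ID strings, with the ranked list ordered from strongest to weakest neighbor.

```python
from typing import Iterable, Optional, Set


def pick_neighbor_to_downtilt(
    ranked_neighbors: Iterable[str],
    nominated_ids: Set[str],
    at_max_downtilt: Set[str],
) -> Optional[str]:
    """Return the first ranked neighbor that is safe to down tilt, else None."""
    for cell_id in ranked_neighbors:
        # Skip neighbors that are themselves nominated serving cells,
        # so a cell already slated for fixing is not weakened.
        if cell_id in nominated_ids:
            continue
        # Skip neighbors that have already reached maximum down tilt.
        if cell_id in at_max_downtilt:
            continue
        return cell_id
    return None
```

A result of `None` in this sketch corresponds to the situation where no neighbor can be adjusted, in which case another remedial option would have to be considered.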

The third remedial case 353 can also apply to a throughput problem, when the ML model or other process at the network analysis platform determines load balancing or transmission power adjustment is needed. In general, the third remedial case 353 can be the opposite of the second case 352. In the third case 353, the network analysis platform can determine that a nominated serving cell needs to offload traffic to a neighbor cell.

Therefore, at stage 361 the network analysis platform can address the third case 353 by performing the etilt adjustment in the opposite direction from stage 362, applying uptilt to neighboring cells that are not also other nominated serving cells. If the uptilt changes are not possible, then the network analysis platform can attempt to increase transmission power of that neighbor cell. If neither is possible, the console GUI can suggest adding additional sites to offload the serving cell.

The fourth remedial case 354 can also apply to a throughput problem but can be based on a problem with load balancing parameters. Each site can have multiple sectors, and each sector has multiple cells, which are also called overlays, including a low band capacity layer and a mid-to-high band capacity layer. At stage 365, the network analysis platform can attempt to shift traffic from the low band to the mid-to-high band within the same sector. But if utilization in the higher band is already high, then the network analysis platform can attempt intra-frequency load balancing to hand off traffic to an intra-frequency neighbor.

If the nominated serving cell is low band, then the network analysis platform can determine the PDCCH utilization difference between the serving cell and a mid-to-high band neighbor. If the difference is greater than a threshold, such as 20%, then the network analysis platform can adjust an inter-frequency hand off parameter, such as by 1 dB. This can cause more sessions to switch over into available higher band service, reducing the load on the nominated serving cell. As with the other remedial actions 361, 362, the HO parameters can be adjusted on neighbor cells that are not also in the list of nominated serving cells. If adjustments are not available based on the nominated cell list or the HO parameters already being maximized or minimized, the network analysis platform can suggest adding new sites on the console GUI.
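For illustration, the inter-frequency check of the fourth remedial case 354 can be sketched as follows. The function name `inter_freq_ho_step` and its parameters are hypothetical, the utilization values are assumed to be fractions between 0 and 1, and the defaults reuse the 20% and 1 dB examples from the text. The sign convention of the returned step is also an assumption, since the direction of the HO parameter change depends on the vendor parameter being adjusted.

```python
from typing import Set


def inter_freq_ho_step(
    serving_util: float,
    neighbor_util: float,
    neighbor_id: str,
    nominated_ids: Set[str],
    diff_threshold: float = 0.20,
    step_db: float = 1.0,
) -> float:
    """Return the HO parameter step in dB, or 0.0 when no change applies."""
    # Do not push traffic onto a neighbor that is itself nominated for fixing.
    if neighbor_id in nominated_ids:
        return 0.0
    # Adjust only when the low band serving cell is clearly more loaded
    # than its mid-to-high band neighbor.
    if serving_util - neighbor_util > diff_threshold:
        return step_db
    return 0.0
```

A return value of 0.0 for every candidate neighbor would correspond to the fallback described above, where the platform can suggest adding new sites.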

In one example, the network analysis platform can also check for conflicts when multiple different root causes are detected. For example, if the first case 351 and second case 352 exist simultaneously, no conflict exists and the remedial actions 361 and 362 can both be carried out. If the first case 351 and third case 353 exist simultaneously, then the changes associated with the third case 353 take precedence at remedial action 361. The network analysis platform can check again in the next time period, such as one hour, to see if the changes eliminated the problem. If not, then additional changes for the first case 351 can be applied.

When the first case 351 and fourth case 354 exist simultaneously, a conflict can exist. The network analysis platform can check for this conflict at stage 364. The reason for the conflict is that increasing coverage will also increase the load. Therefore, the serving cell will need to offload even more traffic to a neighboring cell. In this case, the HO parameter needs to be adjusted in value at stage 363. For example, the HO offset needs to adjust such that more traffic can shift to the neighbor cell than in stage 365. This can offset the otherwise conflicting changes at stage 361.

The method can be applied on all grids in the network. For example, multiple grids can span an MGRS matrix. However, because the same nominated cell can serve areas in multiple grids, it is possible for conflicts to arise between remedial actions in different grids. If conflicts exist for remediating nominated cells of the multiple grids, the conflicts can be handled similarly to conflicts between adjacent bins (in some examples, bins can be sub-grids with their own sub-bins). For example, when changes are made on a 1 km MGRS grid basis and a conflict exists with changes in an adjacent MGRS grid, conflict resolution rules can be implemented by the network analysis platform. As an example rule, if more than one grid (e.g., 1 km MGRS) requires changing the same parameter of the same nominated cell and the changes are in the same direction, no conflict exists. For example, the network analysis platform can suggest a one-degree tilt increase for both grids or a 1 dB change in HO threshold value for both. In either case, the same target cell can be adjusted to remediate the problem in both grids.

However, if more than one grid requires opposing changes to the same target cell, then the network analysis platform can decline to make any changes to the target cell. In other words, if the network analysis platform suggests increasing a parameter for a first grid, but decreasing the parameter of the same nominated cell for a second grid, the network analysis platform can recognize this conflict and make no change to the parameter. For example, analysis of a first grid can suggest increasing etilt on a first cell by one degree, while analysis of a second grid can suggest decreasing the etilt of the first cell. The network analysis platform can make a final decision to not increase or decrease the etilt of the first cell.
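The cross-grid rules above (same direction, no conflict; opposing directions, no change) can be sketched in Python as one possible implementation. The tuple layout and the function name `resolve_cross_grid` are hypothetical. The text does not say what to do when same-direction proposals differ in magnitude, so this sketch assumes the smallest-magnitude change is applied.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical proposal layout: (grid_id, cell_id, parameter, delta)
Proposal = Tuple[str, str, str, float]


def resolve_cross_grid(proposals: List[Proposal]) -> Dict[Tuple[str, str], float]:
    """Merge per-grid proposals into one delta per (cell, parameter)."""
    grouped: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for _grid_id, cell_id, parameter, delta in proposals:
        grouped[(cell_id, parameter)].append(delta)

    resolved: Dict[Tuple[str, str], float] = {}
    for key, deltas in grouped.items():
        same_direction = all(d > 0 for d in deltas) or all(d < 0 for d in deltas)
        if same_direction:
            # Assumption: apply the most conservative of the agreeing changes.
            resolved[key] = min(deltas, key=abs)
        else:
            # Opposing changes across grids: make no change to the parameter.
            resolved[key] = 0.0
    return resolved
```

For example, two grids that each propose a one-degree etilt increase on the same cell resolve to a single one-degree increase, while an increase from one grid and a decrease from another resolve to no change.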

FIG. 4 is an illustration of an example GUI screen 400 for setting parameters used in two-level grid-based anomaly area identification and solution nomination in a radio access network. The GUI screen 400 can be part of a console provided to an administrator of the network analysis platform.

In one example, the GUI screen 400 can include visualization of the grid, such as MGRS grid 402. In this example, the MGRS grid 402 for analysis is a 10 by 10 collection of bins with each bin 406 representing a square area within the grid 402. The grid 402 itself is shaded differently than additional regions 404 of the larger MGRS matrix. This can allow for analysis of a particular region by the network analysis platform. In one example, the grid 402 for analysis can be selected or defined by the administrative user simply moving the grid 402 or its boundaries on a geographic map. In another example, the geographic map is split into multiple grids 402 that are separately analyzed. For example, region 404 can be part of a second MGRS grid that is analyzed in addition to grid 402. The network analysis platform can recommend dividing grids of the existing MGRS matrix based on past offender bins, such that grid 402 is centered on the offenders in an attempt to reduce conflicts between grids. As shown in the example, the stars inside grid 402 can represent KPIs leading to offender bins in the grid 402. The tower icons can represent the location of cells relative to the MGRS matrix and grid 402.

In one example, the GUI screen 400 can include an option 420 for changing bin granularity. This can cause the grid 402 to be comprised of more or fewer bins. The grid 402 can also be sized smaller or larger, in an example. With an MGRS grid, available granularity can be based on the MGRS ID system, which allows for increasing granularity by adding digits to the bin IDs.
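As a brief illustration of how digits map to granularity, an MGRS ID carries an even number of trailing digits, split evenly between easting and northing, and each added digit pair makes the square ten times finer on each side (100 km with no digits, then 10 km, 1 km, 100 m, 10 m, and 1 m). The helper name `mgrs_precision_m` below is hypothetical, and the sketch assumes a well-formed ID that includes the grid zone designator and the two-letter 100 km square identifier.

```python
import re


def mgrs_precision_m(mgrs_id: str) -> int:
    """Return the side length, in meters, of the square named by an MGRS ID."""
    # The numeric location is the run of digits at the end of the ID.
    digits = re.search(r"(\d*)$", mgrs_id.strip()).group(1)
    if len(digits) % 2 != 0 or len(digits) > 10:
        raise ValueError("MGRS numeric part must have 0, 2, 4, 6, 8, or 10 digits")
    pairs = len(digits) // 2
    # No digits names a 100 km square; each digit pair divides the side by 10.
    return 10 ** (5 - pairs)
```

Under this scheme, a 1 km grid ID carries four trailing digits, and a 10 by 10 collection of 100 m bins inside it would be named by IDs with six trailing digits.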

The GUI screen 400 can also include an option 430 for setting analysis frequency. In this example, the time period is set to hourly. However, the analysis can be in 15-minute intervals, day intervals, or other custom intervals. This setting can dictate the time period over which KPI samples are analyzed. Another option 440 can be used to set a threshold for which percentage of worst KPI samples are considered per bin. This setting can impact the averages provided in vector form to the ML model as part of detecting a problem in the grid 402.
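One way the worst-percentage setting of option 440 could feed a per-bin average is sketched below. The function name `worst_fraction_mean` is hypothetical, and the sketch assumes a KPI for which lower values are worse (such as a signal strength or SINR measurement) and that at least one sample is always kept for a non-empty bin.

```python
import math
from typing import Optional, Sequence


def worst_fraction_mean(samples: Sequence[float], worst_fraction: float) -> Optional[float]:
    """Average the worst share of a bin's KPI samples (lower is worse)."""
    if not samples:
        return None  # bin had no samples in the time period
    # Number of samples to keep, rounded up so small bins still contribute.
    keep = max(1, math.ceil(len(samples) * worst_fraction))
    worst = sorted(samples)[:keep]
    return sum(worst) / len(worst)
```

The resulting per-bin values could then be arranged in bin order to form the vector provided to the ML model.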

When the administrative user has selected the desired parameters or grid granularity, the apply button 450 can cause any changes to take effect. The console can communicate the parameters to the network analysis platform, which can utilize the new parameters as part of training a new ML model and in detecting problems in the grid 402. The network analysis platform can perform the analysis separately on multiple grids that exist within the MGRS matrix.

FIG. 5 is an illustration of example system components for performing the stages described above. In one example, the network analysis platform 555 can execute within a service management and orchestration (“SMO”) layer 550 of an open radio access network (“O-RAN”) architecture 520. The SMO layer 550 and network analysis platform 555 can execute on one or more physical hardware servers. The network analysis platform 555 can act as a controller of various services. The network analysis platform 555 can interact with the near-real-time RAN intelligent controller (“RIC”) 560 to make changes to nominated cells over the cloud, in an example. The O1 and A1 interfaces can assist passing the information explained above. These interfaces can also help with conflict resolution.

In one example, real-time data processing is not required for operation of the network analysis platform 555. The network analysis platform 555 can give hourly views of the network at, for example, a 1 km MGRS scale.

The O-RAN architecture 520 can operate on multiple servers and can interface with a core network 510. The core network 510 can provide access controls to ensure users are authenticated for services and can route communications over a telco network.

The O-RAN architecture 520 can act as a link between the core network 510 and user devices 540, such as phones and IoT devices. The KPI samples and user data can come from the cells or from a cell trace record (“CTR”) or drive test. Location information (latitude, longitude) can be included. Cell data can come from performance management (“PM”) counters.

A cell can be integrated with the O-RAN architecture 520 for purposes of connecting to the core network 510. The cell can transmit and receive signals using an antenna 530. The antenna 530 can be tilted or powered as described above to ensure quality connections with user devices 540. The architecture described above can assist with identifying and solving coverage and throughput problems in a RAN.

Other examples of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the examples disclosed herein. Though some of the described methods have been presented as a series of steps, it should be appreciated that one or more steps can occur simultaneously, in an overlapping fashion, or in a different order. The order of steps presented is only illustrative of the possibilities and those steps can be executed or performed in any suitable fashion. Moreover, the various features of the examples described here are not mutually exclusive. Rather any feature of any example described here can be incorporated into any other suitable example. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

What is claimed is:
 1. A method for two-level grid-based anomaly area identification and solution nomination in a radio access network, comprising: mapping key performance indicator (“KPI”) samples from user sessions to individual bins of a grid, wherein the grid applies to a geographic region on a first level, the bins indicate areas within a second level of the geographic region, and the KPI samples are associated with a time period; providing values based on the mapped KPI samples as inputs to a machine learning model, wherein the machine learning model outputs an indication that a problem with coverage or throughput exists for the grid; identifying bins within the grid based on ranking the bins for low performance according to the mapped KPI samples; for each identified bin, determining contributing serving cells based on at least one of: a percentage of mapped KPI samples attributable to each serving cell, and a percentage of mapped KPI samples indicating the low performance; nominating serving cells based on ranking the contributing serving cells across the grid; and applying a remedial action to a nominated serving cell.
 2. The method of claim 1, wherein the grid is a matrix from a military grid reference system (“MGRS”).
 3. The method of claim 1, wherein a graphical user interface (“GUI”) provides at least two of: a first option to adjust the time period for the KPI samples; a second option to adjust a size granularity of bins and the grid; and a third option for configuring a bottom percentile of mapped KPI samples to analyze when ranking the bins.
 4. The method of claim 1, wherein remedial actions are applied to multiple of the nominated serving cells for fixing the problem with the grid.
 5. The method of claim 1, wherein the remedial action improves coverage by changing a tilt or power parameter of the nominated serving cell or a neighbor cell to the nominated serving cell.
 6. The method of claim 1, wherein the remedial action improves signal interference to noise ratio (“SINR”) by changing a tilt or power parameter of the nominated cell or a neighbor cell to the nominated cell.
 7. The method of claim 1, wherein the remedial action adjusts tilt or transmission power parameters to balance a load among the nominated serving cell and a neighbor cell, improving throughput of the nominated serving cell.
 8. The method of claim 1, wherein the remedial action balances a load between the nominated serving cell and at least one of an inter-frequency neighbor cell and an intra-frequency neighbor cell, improving throughput of the nominated serving cell.
 9. A non-transitory, computer-readable medium containing instructions that, when executed by a hardware-based processor, perform stages for two-level grid-based anomaly area identification and solution nomination in a radio access network, the stages comprising: mapping key performance indicator (“KPI”) samples from user sessions to individual bins of a grid, wherein the grid applies to a geographic region on a first level, the bins indicate areas within a second level of the geographic region, and the KPI samples are associated with a time period; providing values based on the mapped KPI samples as inputs to a machine learning model, wherein the machine learning model outputs an indication that a problem with coverage or throughput exists for the grid; identifying bins within the grid based on ranking the bins for low performance according to the mapped KPI samples; for each identified bin, determining contributing serving cells based on at least one of: a percentage of mapped KPI samples attributable to each serving cell, and a percentage of mapped KPI samples indicating the low performance; nominating serving cells based on ranking the contributing serving cells across the grid; and applying a remedial action to a nominated serving cell.
 10. The non-transitory, computer-readable medium of claim 9, wherein the grid is a matrix from a military grid reference system (“MGRS”).
 11. The non-transitory, computer-readable medium of claim 9, wherein a graphical user interface (“GUI”) provides at least two of: a first option to adjust the time period for the KPI samples; a second option to adjust a size granularity of bins and the grid; and a third option for configuring a bottom percentile of mapped KPI samples to analyze when ranking the bins.
 12. The non-transitory, computer-readable medium of claim 9, wherein remedial actions are applied to multiple of the nominated serving cells for fixing the problem with the grid.
 13. The non-transitory, computer-readable medium of claim 9, wherein the remedial action improves coverage by changing a tilt or power parameter of the nominated serving cell or a neighbor cell to the nominated serving cell.
 14. The non-transitory, computer-readable medium of claim 9, wherein the remedial action improves signal interference to noise ratio (“SINR”) by changing a tilt or power parameter of the nominated cell or a neighbor cell to the nominated cell.
 15. The non-transitory, computer-readable medium of claim 9, wherein the remedial action adjusts tilt or transmission power parameters to balance a load among the nominated serving cell and a neighbor cell, improving throughput of the nominated serving cell.
 16. The non-transitory, computer-readable medium of claim 9, wherein the remedial action balances a load between the nominated serving cell and at least one of an inter-frequency neighbor cell and an intra-frequency neighbor cell, improving throughput of the nominated serving cell.
 17. A system for two-level grid-based anomaly area identification and solution nomination in a radio access network, comprising: a memory storage including a non-transitory, computer-readable medium comprising instructions; and a computing device including a hardware-based processor that executes the instructions to carry out stages comprising: mapping key performance indicator (“KPI”) samples from user sessions to individual bins of a grid, wherein the grid applies to a geographic region on a first level, the bins indicate areas within a second level of the geographic region, and the KPI samples are associated with a time period; providing values based on the mapped KPI samples as inputs to a machine learning model, wherein the machine learning model outputs an indication that a problem with coverage or throughput exists for the grid; identifying bins within the grid based on ranking the bins for low performance according to the mapped KPI samples; for each identified bin, determining contributing serving cells based on at least one of: a percentage of mapped KPI samples attributable to each serving cell, and a percentage of mapped KPI samples indicating the low performance; nominating serving cells based on ranking the contributing serving cells across the grid; and applying a remedial action to a nominated serving cell.
 18. The system of claim 17, wherein the grid is a matrix from a military grid reference system (“MGRS”).
 19. The system of claim 17, wherein a graphical user interface (“GUI”) provides at least two of: a first option to adjust the time period for the KPI samples; a second option to adjust a size granularity of bins and the grid; and a third option for configuring a bottom percentile of mapped KPI samples to analyze when ranking the bins.
 20. The system of claim 17, wherein the remedial action improves coverage by changing a tilt or power parameter of the nominated serving cell or a neighbor cell to the nominated serving cell. 